376 research outputs found

    Automatic Spectroscopic Data Categorization by Clustering Analysis (ASCLAN): A Data-Driven Approach for Distinguishing Discriminatory Metabolites for Phenotypic Subclasses

    We propose a novel data-driven approach to reliably distinguish discriminatory from nondiscriminatory metabolites in a spectroscopic data set containing two biological phenotypic subclasses. The automatic spectroscopic data categorization by clustering analysis (ASCLAN) algorithm categorizes the spectral variables in a data set into three clusters corresponding to noise, nondiscriminatory, and discriminatory metabolite regions. This is achieved by clustering each spectral variable based on its r² value, which represents the loading weight of that variable as extracted from an orthogonal partial least-squares discriminant analysis (OPLS-DA) model of the data set. The variables are ranked according to their r² values, and a series of principal component analysis (PCA) models is then built for subsets of the spectral data corresponding to ranges of r² values. The Q²X value for each PCA model is extracted. K-means clustering is then applied to the Q²X values to generate two clusters based on a minimum Euclidean distance criterion. The cluster of lower Q²X values is deemed devoid of metabolic information (noise), while the cluster of higher Q²X values is further subclustered into two groups based on the r² values. We considered the cluster with high Q²X but low r² values as nondiscriminatory, and the cluster with high Q²X and high r² values as discriminatory. The boundaries between these three clusters, expressed as r² values, were taken as the cutoff values defining the noise, nondiscriminatory, and discriminatory variables. We evaluated the ASCLAN algorithm using six simulated ¹H NMR spectroscopic data sets representing small, medium, and large data sets (N = 50, 500, and 1000 samples per group, respectively), each with a reduced and a full resolution set of variables (0.005 and 0.0005 ppm, respectively). ASCLAN correctly identified all discriminatory metabolites and showed zero false positives (100% specificity and positive predictive value) irrespective of the spectral resolution or the sample size in all six simulated data sets. This error rate was found to be superior to existing methods for ascertaining feature significance: univariate t test with Bonferroni correction (up to 10% false positive rate), Benjamini-Hochberg correction (up to 35% false positive rate), and metabolome-wide significance level (MWSL, up to 0.4% false positive rate), as well as various OPLS-DA parameters: variable importance to projection (up to 15% false positive rate), loading coefficients (up to 35% false positive rate), and regression coefficients (up to 39% false positive rate). The application of ASCLAN was further exemplified using a widely investigated renal toxin, mercury(II) chloride (HgCl2), in a rat model. ASCLAN successfully identified many of the known metabolites related to renal toxicity, such as increased excretion of urinary creatinine and different amino acids. The ASCLAN algorithm provides a framework for reliably differentiating discriminatory metabolites from nondiscriminatory metabolites in a biological data set without the need to set an arbitrary cutoff value, as required by some conventional methods. This offers significant advantages over existing methods and the possibility of automating high-throughput screening in "omics" data.
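
The two-stage clustering described in the ASCLAN abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the OPLS-DA and PCA steps have already been run and that each r² range is summarized by one hypothetical (r², Q²X) pair. The function names `two_means_1d` and `categorize` and all numeric values are invented for illustration.

```python
from statistics import mean


def two_means_1d(values, max_iter=100):
    """Split 1-D values into a low and a high cluster using k-means with k=2."""
    lo_c, hi_c = min(values), max(values)  # initial centroids: the extremes
    if lo_c == hi_c:
        raise ValueError("need at least two distinct values")
    for _ in range(max_iter):
        # Assign each value to the nearest centroid (Euclidean distance in 1-D).
        low = [v for v in values if abs(v - lo_c) <= abs(v - hi_c)]
        high = [v for v in values if abs(v - lo_c) > abs(v - hi_c)]
        new_lo, new_hi = mean(low), mean(high)
        if (new_lo, new_hi) == (lo_c, hi_c):
            break  # centroids stopped moving
        lo_c, hi_c = new_lo, new_hi
    return low, high


def categorize(bins):
    """Label each (r2, q2x) pair as noise, nondiscriminatory or discriminatory."""
    # Stage 1: cluster on Q2X; the low-Q2X cluster is treated as noise.
    _, high_q = two_means_1d([q for _, q in bins])
    q_cut = min(high_q)
    # Stage 2: sub-cluster the informative bins on r2.
    _, high_r = two_means_1d([r for r, q in bins if q >= q_cut])
    r_cut = min(high_r)
    labels = []
    for r, q in bins:
        if q < q_cut:
            labels.append("noise")
        elif r < r_cut:
            labels.append("nondiscriminatory")
        else:
            labels.append("discriminatory")
    return labels


# Hypothetical (r2, Q2X) pairs, one per r2 range; not data from the paper.
bins = [
    (0.01, 0.05), (0.03, 0.08), (0.05, 0.10),
    (0.10, 0.60), (0.15, 0.65), (0.20, 0.70),
    (0.70, 0.72), (0.80, 0.75), (0.90, 0.78),
]
labels = categorize(bins)
print(labels)
```

In this sketch the cluster boundaries (`q_cut` and `r_cut`) play the role of the data-derived cutoff values mentioned in the abstract, so no threshold has to be chosen by hand.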

    K-OPLS package: Kernel-based orthogonal projections to latent structures for prediction and interpretation in feature space

    Background: Kernel-based classification and regression methods have been successfully applied to modelling a wide variety of biological data. The Kernel-based Orthogonal Projections to Latent Structures (K-OPLS) method offers unique properties facilitating separate modelling of predictive variation and structured noise in the feature space. While providing prediction results similar to other kernel-based methods, K-OPLS features enhanced interpretational capabilities, allowing detection of unanticipated systematic variation in the data such as instrumental drift, batch variability or unexpected biological variation. Results: We demonstrate an implementation of the K-OPLS algorithm for MATLAB and R, licensed under the GNU GPL and available at http://www.sourceforge.net/projects/kopls/. The package includes essential functionality and documentation for model evaluation (using cross-validation), training and prediction of future samples. Incorporated is also a set of diagnostic tools and plot functions to simplify the visualisation of data, e.g. for detecting trends or for identification of outlying samples. The utility of the software package is demonstrated by means of a metabolic profiling data set from a biological study of hybrid aspen. Conclusion: The properties of the K-OPLS method are well suited for analysis of biological data, which in conjunction with the availability of the outlined open-source package provides a comprehensive solution for kernel-based analysis in bioinformatics applications.

    Hippurate: the natural history of a mammalian-microbial co-metabolite

    Hippurate, the glycine conjugate of benzoic acid, is a normal constituent of the endogenous urinary metabolite profile. It has long been associated with the microbial degradation of certain dietary components, hepatic function and toluene exposure, and is also commonly used as a measure of renal clearance. Here we discuss the potential relevance of hippurate excretion with regard to normal endogenous metabolism and trends in excretion relating to gender, age, and the intestinal microbiota. Additionally, the significance of hippurate excretion with regard to disease states including obesity, diabetes, gastrointestinal diseases, impaired renal function, psychological disorders and autism, as well as toxicity and parasitic infection, is considered.

    Integrated Metabonomic-Proteomic Analysis of an Insect-Bacterial Symbiotic System

    The health of animals, including humans, is dependent on their resident microbiota, but the complexity of the microbial communities makes these associations difficult to study in most animals. Exceptionally, the microbiology of the pea aphid Acyrthosiphon pisum is dominated by a single bacterium, Buchnera aphidicola (B. aphidicola). A ¹H NMR-based metabonomic strategy was applied to investigate the metabolic profiles of aphids fed on a low essential amino acid diet and treated with antibiotic to eliminate B. aphidicola. In addition, differential gel electrophoresis (DIGE) with mass spectrometry was utilized to determine the alterations of proteins induced by these treatments. We found that these perturbations resulted in significant changes to the abundance of 15 metabolites and 238 proteins. Ten (67%) of the metabolites with altered abundance were amino acids, with nonessential amino acids increased and essential amino acids decreased by both perturbations. Over-represented proteins in the perturbed treatments included catabolic enzymes with roles in amino acid degradation and glycolysis, various cuticular proteins, and a C-type lectin and regucalcin with candidate defensive roles. This analysis demonstrates the central role of essential amino acid production in the relationship and identifies candidate proteins and processes underpinning the function and persistence of the association.

    Gut microbiota modulation of chemotherapy efficacy and toxicity

    Evidence is growing that the gut microbiota modulates the host response to chemotherapeutic drugs, with three main clinical outcomes: facilitation of drug efficacy; abrogation and compromise of anticancer effects; and mediation of toxicity. The implication is that gut microbiota are critical to the development of personalized cancer treatment strategies and, therefore, a greater insight into prokaryotic co-metabolism of chemotherapeutic drugs is now required. This thinking is based on evidence from human, animal and in vitro studies that gut bacteria are intimately linked to the pharmacological effects of chemotherapies (5-fluorouracil, cyclophosphamide, irinotecan, oxaliplatin, gemcitabine, methotrexate) and novel targeted immunotherapies such as anti-PD-L1 and anti-CTLA-4 therapies. The gut microbiota modulate these agents through key mechanisms, structured as the 'TIMER' mechanistic framework: Translocation, Immunomodulation, Metabolism, Enzymatic degradation, and Reduced diversity and ecological variation. The gut microbiota can now, therefore, be targeted to improve efficacy and reduce the toxicity of current chemotherapy agents. In this Review, we outline the implications of pharmacomicrobiomics in cancer therapeutics and define how the microbiota might be modified in clinical practice to improve efficacy and reduce the toxic burden of these compounds.

    New methodology for known metabolite identification in metabonomics / metabolomics: topological metabolite identification carbon efficiency (tMICE)

    A new, simple-to-implement and quantitative approach to assessing the confidence in NMR-based identification of known metabolites is introduced. The approach is based on a topological analysis of metabolite identification information available from NMR spectroscopy studies and is a development of the metabolite identification carbon efficiency (MICE) method. New topological metabolite identification indices are introduced, analysed and proposed for general use, including topological metabolite identification carbon efficiency (tMICE). Since known metabolite identification is one of the key bottlenecks in both NMR spectroscopy- and mass spectrometry-based metabonomics/metabolomics studies, and given that there is no current consensus on how to assess metabolite identification confidence, it is hoped that these new approaches and the topological indices will find utility.

    A comparison of collision cross section values obtained via travelling wave ion mobility-mass spectrometry and ultra high performance liquid chromatography-ion mobility-mass spectrometry: application to the characterisation of metabolites in rat urine

    A comprehensive Collision Cross Section (CCS) library was obtained via Travelling Wave Ion Guide mobility measurements through direct infusion (DI). The library consists of CCS and Mass Spectral (MS) data in negative and positive ElectroSpray Ionisation (ESI) mode for 463 and 479 endogenous metabolites, respectively. For both ionisation modes combined, TWCCSN2 data were obtained for 542 non-redundant metabolites. These data were acquired on two different ion mobility enabled orthogonal acceleration QToF MS systems in two different laboratories, with the majority of the resulting TWCCSN2 values (from detected compounds) found to be within 1% of one another. Validation of these results against two independent, external TWCCSN2 data sources and against predicted TWCCSN2 values showed them to be within 1-2% of these other values. The same metabolites were then analysed using a rapid reversed-phase ultra (high) performance liquid chromatographic (U(H)PLC) separation combined with IM and MS (IM-MS), thus providing retention time (tr), m/z and TWCCSN2 values (with the latter compared with the DI-IM-MS data). Analytes for which TWCCSN2 values were obtained by U(H)PLC-IM-MS showed good agreement with the results obtained from DI-IM-MS. The repeatability of the TWCCSN2 values obtained for these metabolites on the different ion mobility QToF systems, using either DI or LC, encouraged further evaluation of the U(H)PLC-IM-MS approach via the analysis of rat urine samples from control and methotrexate-treated animals, in order to assess the potential of the approach for metabolite identification and profiling in metabolic phenotyping studies. Based on the database derived from the standards, 63 metabolites were identified in rat urine, using positive ESI, from the combination of tr, TWCCSN2 and MS data.